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Abstract 

Development of a highly reproducible and sensitive single-cell RNA sequencing (RNA-seq) method would facilitate 
the understanding of the biological roles and underlying mechanisms of non-genetic cellular heterogeneity. In this 
study, we report a novel single-cell RNA-seq method called Quartz-Seq that has a simpler protocol and higher 
reproducibility and sensitivity than existing methods. We show that single-cell Quartz-Seq can quantitatively detect 
various kinds of non-genetic cellular heterogeneity, and can detect different cell types and different cell-cycle 
phases of a single cell type. Moreover, this method can comprehensively reveal gene-expression heterogeneity 
between single cells of the same cell type in the same cell-cycle phase. 
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Background 

Non-genetic cellular heterogeneity at the mRNA and pro- 
tein levels has been observed within cell populations in 
diverse developmental processes and physiological condi- 
tions [1-4]. However, the comprehensive and quantitative 
analysis of this cellular heterogeneity and its changes in 
response to perturbations has been extremely challenging. 
Recently, several researchers reported quantification of 
gene-expression heterogeneity within genetically identical 
cell populations, and elucidation of its biological roles and 
underlying mechanisms [5-8]. Although gene-expression 
heterogeneities have been quantitatively measured for sev- 
eral target genes using single-molecule imaging or single- 
cell quantitative (q)PCR, comprehensive studies on the 
quantification of gene-expression heterogeneity are limited 
[9] and thus further work is required. Because global 
gene-expression heterogeneity may provide biological 
information (for example, on cell fate, culture environ- 
ment, and drug response), the question of how to compre- 
hensively and quantitatively detect the heterogeneity of 
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mRNA expression in single cells and how to extract biolo- 
gical information from those data remains to be addressed. 

Single-cell RNA sequencing (RNA-seq) analysis has 
been shown to be an effective approach for the compre- 
hensive quantification of gene-expression heterogeneity 
that reflects the cellular heterogeneity at the single-cell 
level [10,11]. To understand the biological roles and 
underlying mechanisms of such heterogeneity, an ideal 
single-cell transcriptome analysis method would provide 
a simple, highly reproducible, and sensitive method for 
measuring the gene-expression heterogeneity of cell 
populations. In addition, this method should be able to 
distinguish clearly the gene-expression heterogeneity 
from experimental errors. 

Single-cell transcriptome analyses, which can be 
achieved through the use of various platforms, such as 
microarrays, massively parallel sequencers and bead arrays 
[12-17], are able to identify cell-type markers and/or rare 
cell types in tissues. These platforms require nanogram 
quantities of DNA as the starting material. However, a 
typical single cell has approximately 10 pg of total RNA 
and often contains only 0.1 pg of polyadenylated RNA, 
hence, o obtain the amount of DNA starting material that 
is required by these platforms, it is necessary to perform 
whole- transcript amplification (WTA). 
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Previous WTA methods for single cells fall into two 
categories, based on the modifications that are introduced 
into the first-strand cDNAs in the PCR-based methods. 
One approach is based on the poly-A tailing reaction, and 
the other on the template- switching reaction. In principle, 
the goal of poly-A tailing is to obtain both full-length first- 
strand cDNAs and truncated cDNAs. The aim of template 
switching is to obtain first-strand cDNAs that have 
reached the 5' ends of the RNA templates. These modified 
cDNAs are amplifiable by subsequent PGR enrichment 
methods. 

Kurimoto et al reported a quantitative WTA method 
based on the poly-A-tailing reaction for single-cell micro- 
arrays [12]. They used this single-cell transcriptome analy- 
sis, and published initial validation data for technical 
replicates, each of which required 10 pg of total RNA. The 
Pearson correlation coefficient (PCC) for the reproducibil- 
ity of this method using 10 pg of total RNA per reaction 
was approximately 0.85 [12]. Using a method similar to 
the one used by Kurimoto et aL, Tang et al, performed 
single-cell RNA-seq. When they applied their method to a 
single mouse oocyte (around 1 ng of total RNA), these 
researchers were able to detect a larger number of genes 
than could be identified using a microarray approach [13]. 
However, these methods are complicated because they 
require multiple PGR tubes for a single cell, and gel purifi- 
cation is required for the removal of unexpected bypro- 
ducts [18,19]. Furthermore, detailed quantitative analysis 
of the performance of the Tang et al, single-cell RNA-seq 
method, including its reproducibility and sensitivity, has 
not been analyzed. 

Two single-cell RNA-seq methods based on the tem- 
plate-switching reaction have been reported. Islam et al, 
described a method called single-cell tagged reverse tran- 
scription sequencing (STRT-seq), which is a highly multi- 
plexed single-cell RNA-seq method that can detect the 
restricted 5' ends of mRNAs [14]. Ramskold et al, devel- 
oped Smart-Seq (the WTA part of Smart-Seq is now mar- 
keted as SMARTer Ultra Low RNA Kit for lUumina 
Sequencing, Glontech, Mountain View, GA, USA), which 
exhibits a greater read coverage across transcripts than pre- 
viously developed methods [16]. The PGGs for the reprodu- 
cibility of the methods using 10 pg of total RNA were both 
approximately 0.7. Recently, Hashimshony et al, described 
CEL-Seq (Gell Expression by Linear amplification and 
Sequencing), which is an in vitro transcription (I VT) -based 
method but not a PGR-based method. GEL-Seq is a highly 
multiplexed single-cell RNA-seq method that can detect 
the 3' end of mRNA [17]. GEL-Seq was shown to detect 
significantly more genes in single mouse embryonic stem 
(ES) cells compared with STRT-Seq. The performance of 
these reported methods is sufficient for the identification of 
cell-type markers. However, their specifications for WTA 
did not validate whether the methods are sufficient to 



quantitatively assess the global gene-expression heterogene- 
ity that is indicative of cellular heterogeneity. Because the 
PGG for reproducibility is greater than 0.95 for conven- 
tional non-WTA RNA-seq, it would be desirable to 
improve the reproducibility and sensitivity of single-cell 
RNA-seq to a greater degree than is possible with existing 
methods. 

To comprehensively and quantitatively detect gene- 
expression heterogeneity, we have developed a simple and 
highly quantitative single-cell RNA-seq approach that we 
term Quartz-Seq. In this study, we identified some defec- 
tive factors that allowed us to simplify the experimental 
procedures and improve the quantitative performance. 
In particular, to maintain the simplicity and enhance the 
quantitative performance of WTA, we improved three 
crucial aspects: 1) we achieved robust suppression of 
byproduct synthesis; 2) we identified a robust PGR enzyme 
that allows the use of a single-tube reaction; and 3) we 
determined the optimal conditions of reverse transcription 
(RT) and second-strand synthesis for the capturing mRNA 
and the first-strand cDNA. We also performed a quantita- 
tive comparison between our method and previously 
developed methods using 10 pg of total RNA as the start- 
ing material the reproducibility and sensitivity of the 
Quartz-Seq method was better than those of the other 
methods. 

When used in the global expression analysis of real 
single cells, the single-cell Quartz-Seq approach success- 
fully detected gene expression heterogeneity even between 
cells of the same cell type and in the same cell-cycle 
phase. This observed gene-expression heterogeneity was 
found to be highly reproducible in two independent 
experiments, and could be distinguished from experimen- 
tal errors, which were measured through technical repli- 
cates of pooled samples. We also found that single-cell 
Quartz-Seq was able to discriminate more easily between 
different cell types and/or between different cell-cycle 
phases. Therefore, single-cell Quartz-Seq is a useful 
method for the comprehensive identification and quantita- 
tive assessment of cellular heterogeneity. 

Results 

Whole-transcript amplification for single-cell Quartz-Seq 
and Quartz-Chip 

The WTA for Quartz-Seq and Quartz- consists of five main 
steps (Figure 1). The first step is a reverse transcription with 
an RT primer to generate the first-strand cDNAs from the 
target RNAs. The second step is a primer digestion with 
exonuclease I; this is one of the key steps to prevent the 
synthesis of byproducts. The third step is the addition of 
a poly-A tail to the 3' ends of the first-strand cDNAs, and 
the fourth step is the second-strand synthesis using a tagging 
primer, which prepares the substrate for subsequent amplifi- 
cation. The fifth step is a PGR enrichment reaction with 
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Figure 1 Schematic of the single-cell Quartz-Seq and Quartz-Chip methods. All of the steps of the whole-transcript amplification were executed in 
a single PCR tube. The first-strand cDNA was synthesized using the reverse transcription (RT) primer, which contains oligo-dT24, the T7 promoter (T7) 
and the PCR target region (M) sequences. After the first-strand synthesis, the majority of the RT primer was digested by exonuclease I, although it was 
not possible to eliminate the RT primer completely using this procedure. A poly-A tail was then added to the 3' ends of the first-strand cDNA and to 
any surviving RT primer. After the second-strand synthesis with the tagging primer, the resulting cDNA and the byproducts from the surviving primers 
contained the whole-transcript amplification {WTA) adaptor sequences, which include the RT primer sequence and the tagging primer sequence. These 
DMAs were used for the suppression PCR, which used the suppression PCR primer. Enrichment of the short DNA fragments, such as the byproducts, 
was suppressed. After the enrichment, the high-quality cDNA, which did not contain any byproducts, was obtained. The amplified cDNAs then had the 
T7 promoter sequence at the 3' ends of the DNA. These cDNAs were used for the lllumina sequencing and microarray experiments. 
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a suppression PGR primer to ensure that a sufficient quan- 
tity of DNA is obtained for the massively parallel sequencers 
or microarrays. All five steps are completed in a single PGR 
tube without any purification. The amplified cDNA contains 
WTA adaptor sequences from the RT primer and the tag- 
ging primer. 

The amplified cDNA was then used in a massively 
parallel sequencer (Quartz-Seq) and a microarray system 
(Quartz-Ghip). For the Quartz-Seq, the amplified cDNA 
was fragmented using the Govaris shearing system. The 
fragmented cDNA was ligated to adaptors, which enable 
the multiplex production of paired-end (PE) sequences. 
The DNA sequencing library was analyzed using an Illu- 
mina sequencer. For the Quartz-Ghip method, we 
synthesized labeled cRNA from the amplified cDNA 
using in vitro transcription. The labeled cRNA was used 
for the microarray analysis. 

Performance improvements of whole-transcript 
amplification 

In previous WTA methods based on the poly-A-tailing 
reaction, excessive amounts of byproducts are produced 
(see Additional file 1, Figure SI). These byproducts 
(usually DNA < 200 bp in length) are derived from the RT 
primer. The RT primer is modified by terminal deoxynu- 
cleotidyl transferase, similarly to the first-strand cDNA. 
The modified RT primer then causes synthesis of the 
byproducts [18], and the amplified byproducts need to be 
removed by gel purification [18,19] (see Additional file 2, 
Supplementary note). This gel-purification step for the 
removal of these byproducts increases the complexity of 
the method. The byproducts contain WTA adaptor 
sequences. We found that the byproducts cause early 
saturation of the PGR amplification, and reduce the molar 
ratio between the objective cDNA and the byproducts. 
The contamination rate from the WTA adaptor sequence 
was dramatically increased using the lUumina sequencing 
method (see Additional file 1, Figure S2e,f). 

To overcome this byproduct contamination, the bypro- 
duct synthesis was completely eliminated using a combi- 
nation of exonuclease I treatment, restricted poly-A 
tailing, and an optimized suppression PGR (see Addi- 
tional file 1, Figure S3 and Figure S4). We successfully 
eliminated the synthesis of byproducts using the follow- 
ing three improvement points, thus eliminating the need 
for gel purification. 

The first improvement point is the adjustment of the RT 
primer concentration after the RT procedure described 
above. We used the minimum primer concentration for 
the RT. Moreover, we removed the RT primer by treating 
with exonuclease I, which digests single-strand DNAs such 
as primers. This exonuclease I digestion suppressed the 
synthesis of byproducts (see Additional file 1, Figure S2a). 
However, the primer removal was not complete at this 



point, which is in agreement with the results of a previous 
study [18] (see Additional file 1, Figure SI). The molar 
ratio between the single-cell-level mRNA (0.1 pg; Ensembl 
Mouse Transcript 1817 bp average size) and the RT primer 
is greater than 190,000. Gomplete removal solely by exonu- 
clease I digestion was difficult. The remaining primers were 
then modified by terminal deoxynucleotidyl transferase, 
similarly to the first-strand cDNA; these modified primers 
caused the production of byproducts. 

In the second improvement point, to prevent amplifica- 
tion of the modified primers, we used suppression PGR 
technology. Suppression PGR is very effective in the sup- 
pressing amplification of small-size DNA that contains 
complementary sequences at both ends of the template 
DNA [20]. In suppression PGR, these complementary 
sequences can bind to each other, and the self-bound tem- 
plate DNA forms a 'pan-like' structure. In addition, the 
DNA is not amplified by PGR because the PGR primer 
cannot bind to the template DNA (see Additional file 1, 
Figure S4). The target DNA size of suppression PGR 
depends on the end of the complementary sequences 
(including the length and GG content). We also identified 
a good primer sequence for the suppression PGR, and 
showed that the B-primer effectively suppressed the synth- 
esis of byproducts (see Additional file 1, Figure S2c and 
Figure S4b). 

In the third improvement point, to shorten the length of 
the remaining primers modified by the terminal transfer- 
ase and thus make them targets for suppression PGR, we 
restricted the reaction time of the terminal transferase. 
This restriction suppressed the synthesis of byproducts 
(see Additional file 1, Figure S2b). 

It should be noted that in addition, we found that 
topoisomerase V could suppress byproduct synthesis 
(see Additional file 1, Figure S2d). However, the 
mechanism of byproduct suppression by topoisomerase 
V is not known. Using the combination of the three pre- 
viously discussed improvement points, we successfully 
suppressed the synthesis of byproducts completely in a 
single-tube reaction (see Additional file 1, Figure S2e), 
without using topoisomerase V. 

Furthermore, we selected a robust PGR enzyme that was 
optimal for the single-tube reaction. The use of this DNA 
polymerase (MightyAmp DNA Polymerase (Takara Bio, 
Inc., Tokyo, Japan), also marketed as Terra PGR Direct 
Polymerase (Glontech, Mountain View, GA, USA)) 
improved the yield of cDNA (see Additional file 1, Figure 
S2c and Figure S5) and the reproducibility of the WTA 
replication (see Additional file 1, Figure S2d and Figure 
S5). In addition, by using this DNA polymerase (Might- 
yAmp), the number of PGR cycles in WTA could be 
reduced. 

Moreover, we improved the efficiencies of the RT and 
second-strand synthesis steps to counter the lowered 
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reproducibility of the WTA that occurred as a result of 
the variable efficiencies of these steps. We identified the 
optimal annealing temperature that reduced the variabil- 
ity in the efficiencies of these steps (see Additional file 
1, Figure S2a,d b). Our simplified method enabled us to 
consistently obtain highly reproducible cDNA that was 
optimized for RNA-seq (see Additional file 1, Figure S6). 

Reproducibility and sensitivity of single-cell Quartz-Seq 

We performed single-cell Quartz-Seq with 10 pg of 
diluted total ES-cell RNA to validate the reproducibility 
of the technical replicates. We prepared a multiplex, PE 
DNA sequencing library from the amplified cDNA pro- 
duced from the 10 pg of total RNA. The DNA sequen- 
cing library was analyzed using a massively parallel 
sequencer (HiSeq 1000/2000; Illumina). 

Pairwise comparisons of the products of triplicate 
amplifications were used to quantify the reproducibility 
of the protocol based on the PCCs (Figure 2a) in logio- 
transformed fragments per kilobase of transcript per 
million fragments sequenced (FKPM). The PCCs of 
these comparisons were approximately 0.93. We 
counted highly reproducible expressed transcripts with 
FPKM greater than 1.0 and that exhibited less than two- 
fold expression changes between technical replicates. 
The single-cell Quartz-Seq method was capable of 
reproducibly detecting a (mean ± SD) 8,110.3 ± 100.8 
(82.1 ± 0.6%) of 9,872.6 ± 54.4 transcripts (Figure 2b; 
for pairwise plots, see Additional file 3, Figure S7, and 
for linear regression and correlation analysis, see Addi- 
tional file 4, Table SI). 

To evaluate the sensitivity of Quartz-Seq, we com- 
pared the results of conventional RNA-seq (non-WTA) 
and Quartz-Seq. The PCCs of these comparisons were 
approximately 0.89, Figure 2a). We also counted highly 
sensitive transcripts that had FPKM greater than 1 and 
exhibited less than two-fold expression changes between 
technical replicates. The single-cell Quartz-Seq method 
was capable of sensitively detecting 6,605 ± 139.9 (68.1 
± 0.5%) of 9,686 ± 63.7 transcripts (Figure 2b). 

To evaluate the over-representation of the sequences 
derived from the WTA and the library preparation, we 
searched for the WTA adaptor sequence in all of the 
sequence reads using sequence similarity (see details in 
Materials and methods). Using Smart-Seq, 21.1 ± 3.05% of 
the sequences were identified as WTA adaptor (Smart-Seq, 
10 pg ES-cell DNA, PE 30 million reads, n = 4; see Addi- 
tional file 1, Figure S8) whereas 7.68 ± 0.66% of the 
sequences were identified as WTA adaptors by Quartz-Seq 
(Quartz-Seq, 10 pg ES-cell DNA, PE 60 million reads, n = 3; 
see Additional file 1, Figure S8). 

We also evaluated the number of reads required for 
the method to detect mRNAs. From 0.01, 0.1, 0.5 1.0, 5.0, 
10, 30, and 45 million reads (uniquely mapped reads. 



single-ended, 50 bp), we counted the number of detected 
genes and calculated the PCCs between different samples 
of the same origin (10 pg of total RNA). We found that a 
Quartz-Seq result from more than 1 million reads had a 
correlation of greater than 0.9 and detected more than 
7,642 ± 40 transcripts (73 ± 0.19%) (see Additional file 1, 
Figure S9). 

Furthermore, we compared the cDNA lengths resulting 
from the Quartz-Seq and conventional RNA-seq meth- 
ods (see Additional file 1, Figure SIO). The median of the 
read coverage across the expressed transcripts (FPKM > 
10) was 53.8% (705 bp) for Quartz-Seq compared with 
84.8% (1,326 bp) using conventional RNA-seq. 

Comparison between Quartz-Seq and other methods 

We carefully compared the quantitative performance of 
Quartz-Seq/Chip and three reported methods for single- 
cell transcriptome analysis. To compare the reproducibility 
of Quartz-Seq and Smart-Seq, four Smart-Seq data sets 
from the first Smart-Seq paper published by Ramskold 
et al were downloaded from the National Center for Bio- 
technology (NCBI) Gene Expression Omnibus (GEO) 
repository (GSE38495). We calculated the PCC between 
pairs of samples using 10 pg of total RNA (see Additional 
file 3, Figure S7; see Additional file 4, Table SI). The read 
numbers for the Quartz-Seq and Smart-Seq methods were 
adjusted to approximately 30 million reads of single-end 
sequences of 50 bp. The PCC values for Quartz-Seq and 
Smart-Seq were approximately 0.93 and 0.7, respectively 
(Figure 3a). 

We counted highly reproducible transcripts that had 
FPKM or RPKM (reads per kilobase of exon model per 
million mapped) greater than 1 and exhibited less than 
two-fold expression changes between technical replicates. 
Quartz-Seq detected 8,110.3 ± 100 of 9,872.6 ± 54.4 tran- 
scripts (82.1 ± 0.6%), whereas Smart-Seq detected 2,745.5 
± 87.2 of 5,286.6 ± 92.0 transcripts (51.9 ± 0.8%) (human 
brain), 2,312 of 4,202 transcripts (55.0%) (mouse brain), 
and 3,706.0 ± 173.5 of 7,388.4 ± 256.2 transcripts (50.1 ± 
1.5%) (universal human reference RNA). To confirm the 
experimental reproducibility, we performed additional 
Quartz-Seq and Smart-Seq procedures using 10 pg of 
diluted total RNA from an ES cell (n = 4). The read num- 
bers for Quartz-Seq and Smart-Seq were adjusted to 
approximately 30 million reads of PE sequences of 50 bp. 
In this comparison, the PCC of Quartz-Seq was approxi- 
mately 0.93, whereas the PCC of Smart-Seq was approxi- 
mately 0.72 (Figure 3a). Quartz-Seq detected 7,739 ± 38.5 
of 9,506.1 ± 22.7 transcripts (81.4 ± 0.5%), whereas 
Smart-Seq detected 2,320.6 ± 34.7 of 3675 ± 100.6 tran- 
scripts (63.1 ± 1.3%. 

To evaluate the sensitivity of Quartz-Seq and Smart-Seq, 
we compared the results of conventional RNA-seq 
(non-WTA) and each method from same pooled total 
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Figure 2 Reproducibility and sensitivity of single-cell Quartz-Seq. (a) Representative scatter plot of the gene-expression data from two 
replicate single-cell Quartz-Seq analyses of 10 pg of total embryonic stem (ES)-cell RNA (left panel). The blue line indicates a two-fold change, 
and the red line is a linear regression. Scatter plot of single-cell Quartz-Seq (whole-transcript amplification; WTA) and conventional RNA-seq (non- 
WTA) data using 1 |jg of total ES-cell RNA (right panel), (b) Ratio of detected genes for three replicate Quartz-Seq analyses. Using two different 
independent Quartz-Seq experiments, 82.1% of the genes were detected by (left panel). The right panel shows the ratio of the genes detected 
by Quartz-Seq and conventional RNA-seq; more than 68.2% of the genes were detected by single-cell Quartz-Seq. 



RNA mixture. The PCCs of these comparisons were 
approximately 0.88 for Quartz-Seq and 0.7 for Smart-Seq 
(Figure 3b). We also counted highly sensitive transcripts 
that had FPKM greater than 1 and exhibited less than 
two-fold expression changes between technical replicates. 
Using 10 pg ES total RNA, Smart-Seq was capable of 
detecting 2,906 ± 87.4 of 6,263 ± 149.1 transcripts (46.3 ± 
0.5%) whereas Quartz-Seq detected 6,191.2 ± 58.0 of 9,342 
± 134.7 transcripts (66.2 ± 0.6%). 

Next, we compared the quantitative performance 
between Quartz-Seq and CEL-Seq. CEL-Seq data sets from 
the original paper published by Hashimshony et al were 
downloaded from the NCBI Sequence Read Archive (SRA) 
(SRP014672). We calculated the FCC between pairs of 
samples using 10 pg of C. elegans total RNA (see Additional 
file 3, Figure S7; see Additional file 4, Table SI). The PCCs 



of these comparisons were approximately 0.72. We counted 
highly reproducible expressed transcripts that had greater 
than 1.0 tags per million (tpm) and exhibited less than two- 
fold expression changes between technical replicates. CEL- 
Seq was capable of reproducibly detecting 2,564 ± 183.8 of 
5,196.8 ± 364.9 transcripts (49.3 ± 1.7%). Moreover, we rea- 
nalyzed CEL-Seq data with mouse ES cell. We counted 
highly reproducible expressed transcripts that were greater 
than 1.0 tpm using the data. CEL-Seq detected 4,070.3 ± 
332.4 transcripts in single mouse ES cells {n = 9), whereas 
Quartz-Seq detected 6,069.1 ± 854.9 transcripts in the 
same cell type {n = 35). 

Subsequently, we compared the quantitative perfor- 
mances of our method and other methods based on the 
poly-A-tailing reaction. The detailed quantitative perfor- 
mance of the single-cell RNA-seq method of Tang et al 
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Figure 3 Comparison of the performances of Quartz-Seq, Quartz-Chip and other methods (a) Box plot of Pearson correlation coefficients 
(PCCs) for the technical replication of Quartz-Seq and Smart-Seq with 10 pg diluted total RNA. We reanalyzed the following four original Smart- 
Seq datasets: mouse brain: MB; human brain (Nextera library preparation kit: HB (Nx); Universal human reference RNA: UHRR and UHRR with the 
Nextera library preparation kit: UHRR (Nx). The asterisk indicates the downsampling sequence reads from single-cell Quartz-Seq (paired-end (PE), 
60 million reads, n = 3). (b) Box plot of PCCs between conventional RNA-seq and single-cell RNA-seq methods, (c) Comparison of our current 
and previous methods using 10 pg of total ES-cell RNA with GeneChip. Performance of (left) Quartz-Chip, and (right) the Kurimoto et a\. method. 
The Kurimoto et a\. data were reanalyzed using the original sets. The bar plots in the right panels show the numbers of genes that were 
detected with each method: the gray bars show the total number of genes, and the blue bars indicate the number of detected genes. 
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using 10 pg total RNA has not been analyzed. Therefore, 
we evaluated the performance (reproducibility and sensitiv- 
ity) of the Quartz- Chip and Kurimoto et al methods using 
a chip array (GeneChip; Aff)^metrix Inc., Santa Clara, CA, 
USA) with 10 pg of diluted total RNA. For this compari- 
son, we reanalyzed the original data from Kurimoto et al, 
and compared the results with the Quartz- Chip data. We 
first compared the technical duplicates to quantify the 
reproducibility of both protocols (Figure 3c, upper panels). 
In this analysis, we counted a transcript that had robust 
multi-array averaging (RMA) expression greater than 7.0 
and exhibited less than two-fold expression change 
between technical duplicates. The Quartz-Chip method 
was capable of reproducibly detecting 7,520 of 9,622 tran- 
scripts (78.2%) in the technical duplicates; however, the 
Kurimoto et al method detected 5,224 of 6,976 transcripts 
(74.9%). In addition, 69.8% of the transcripts detected by 
the Kurimoto method were detected by Quartz-Chip. 

We then compared the results of conventional non- 
WTA GeneChip with the Quartz-Chip and the Kurimoto 
et al methods to quantif)^ the sensitivity of both protocols 
(Figure 3c, lower panels). Similarly to the reproducibility 
analysis, we counted a transcript that had RMA expression 
greater than 7.0 and exhibited less than two-fold expres- 
sion changes between the non-WTA GeneChip and the 
WTA samples (either the Quartz-Chip or the Kurimoto et 
al. methods). The Quartz-Chip method was capable of 
sensitively detecting 7,557 of 9,849 transcripts (77%), 
whereas the Kurimoto et al method detected 4,686 of 
7,704 transcripts (60.8%). In addition, 73.7% of the tran- 
scripts detected by the Kurimoto et al method were also 
detected by Quartz-Chip. 

Limitations of Quartz-Seq 

We investigated whether there are specific transcript struc- 
tures that have greater noise or are under-represented. We 
calculated and compared the GC content and cDNA 
lengths of the amplified and unamplified isoforms obtained 
by each single-cell RNA-seq method (see Additional file 1, 
Figure Sll). As expected, we found that the unamplified 
isoforms from Quartz-Seq had a higher GC content (mean 
52.1%) than the amplified isoforms (mean 50.2%). In addi- 
tion, the unamplified isoforms from Quartz-Seq had a 
higher GC content (mean 52.1%) than those from Smart- 
Seq (mean 51.5%). In the analysis of cDNA lengths, we 
found that the unamplified isoforms from Quartz-Seq had 
a shorter cDNA length (mean 1,684.0 bp) than the ampli- 
fied (or detected) isoforms (mean 2558.6 bp). 

We then represented the detailed comparison of techni- 
cal noise between different polymerases. We performed 
Quartz-Seq with Ex Taq DNA polymerase (TaKaRa) 
instead of MightyAmp DNA polymerase because Ex Taq 
DNA polymerase has been used in previous methods 
(Kurimoto et al and of Tang et al) that were based on the 



poly-A-tailing reaction. We calculated the GC contents 
and cDNA lengths of the amplified and unamplified iso- 
forms (see Additional file 1, Figure Sll). Using Quartz- 
Seq, the unamplified isoforms produced by MightyAmp 
had a higher GC content (mean 52.1%) than those pro- 
duced by Ex Taq (mean 51.7%), while the unamplified iso- 
forms produced by MightyAmp had a shorter cDNA 
length (mean 1,684.0 bp) than those produced by Ex Taq 
(mean 2,481.1 bp). 

Single-cell Quartz-Seq detects different cell types 

Heterogeneous cell populations, such as cultured cell 
lines and tissues, are composed of various types of single 
cells, which have different gene-expression patterns. We 
therefore tested whether single-cell Quartz-Seq can dis- 
tinguish between different cell types and whether it can 
detect the differentially expressed genes that are charac- 
teristic of each cell type. We performed single-cell 
Quartz-Seq with 12 mouse ES cells and with 12 primi- 
tive endoderm (PrE) cells that are directly differentiated 
from ES cells [21]. We collected single cells directly into 
PCR tubes using fluorescence-activated cell sorting 
(FACS), as previously reported [22]. In this system, all 
sorted cells were readily discerned as single cells (see 
Additional file 1, Figure S13). We successfully obtained 
amplification products from almost all of these single 
cells (98%, n = 50, see Additional file 1, Figure S13). 
The ES and PrE cells were collected during the Gl 
phase of the cell cycle (see Additional file 1, Figure SI 2). 
Although the cell population that combined cells from 
all cell-cycle phases contained an average of approxi- 
mately 10 pg of total RNA per cell, the cells in Gl 
phase contained only approximately 6 pg of total RNA 
per single cell (see Additional file 1, Figure S14). 

We first performed a cluster analysis of all the tran- 
scripts from all of the samples. The global expression pat- 
terns of the ES and PrE cells were clearly divided into two 
clusters (Figure 4a). A heat map of the ES and PrE marker 
genes and the non-differentially expressed genes is shown 
in Figure 4b. We detected 1,620 and 1,436 differentially 
expressed genes in the ES and PrE cells, respectively. 
These differentially expressed genes included the ES mar- 
ker genes (for example, Nanog, PouSfl, and Fgf4) and the 
PrE marker genes (for example, Gata4, Gata6, and Dab2), 
In addition, these marker genes had clear differential 
expression between the ES and PrE cells. By contrast, the 
non-differentially expressed genes, such as Gnbl I and 
Eeflb2y did not exhibit a differential expression pattern 
(Figure 4b). 

We then validated the expression patterns of the ES and 
PrE marker genes and the non-differentially expressed 
genes using an amplification-free single-cell qPCR method. 
To avoid any amplification bias, we directly detected the 
gene expression from single cells without amplification 
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Figure 4 Single-cell Quartz-Seq detects different cell types. Results of single-cell Quartz-Seq with 12 embryonic stem (ES) cells and 12 primitive 
endoderm (PrE) cells, (a) Clustering of all samples, (b) Heat map of the marker genes for ES and PrE cells and the housekeeping genes. The bar plot in 
the right panel shows the mutual information (Ml); a high degree of Ml indicates high differential expression between two cell states, (c) Verification of 
the expression pattern between single cells using amplification-free single-cell quantitative (q)PCR. The gene-expression data for single cells 
correspond to the ES (n = 96 single ES cells in the Gl phase of the cell cycle), PrE (n = 96 single PrE cells in the Gl phase), ES200 (n = 40 single-cell- 
sized samples from pooled lysis of ES cells in the Gl phase), and PrE200 (n = 40 single-cell-sized samples from pooled lysis of PrE cells in the Gl phase) 
groups. The ES and PrE box plots represent the gene-expression variability, which includes the biological variability and the experimental error, while 
the ES200 and PrE200 box plots represent the gene-expression variability due to experimental error. 



(see details in Materials and methods). The results show 
expression patterns for the ES and PrE cell markers and 
the non-differentially expressed genes that were highly cor- 
related with the single-cell RNA-seq data. The gene- 
expression levels of PouSfl and Zfp42 were dramatically 
decreased in the majority of the single PrE cells. However, 
the gene-expression levels of PouSfl and Zfp42 remained 
high in a small number of single PrE cells (Figure 4c). This 
trend was observed in both the single-cell Quartz-Seq and 
the amplification-free single-cell qPCR methods. 



Single-cell Quartz-Seq detects the different cell-cycle 
phases of a single cell type 

In addition to being a result of different cell types, gene- 
expression heterogeneity can also result from different 
cell-cycle phases. To investigate the performance limits of 
the single-cell Quartz-Seq method, we tested whether this 
method is able to distinguish cell-cycle-dependent hetero- 
geneity among ES cells. 

We performed single-cell Quartz-Seq with ES cells in 
different cell-cycle phases (Gl, S, and G2/M) and then 
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used principal components analysis (PCA) to analyze the 
results. The single PrE cells and the single ES cells formed 
two clearly divided clusters: the ES and PrE clusters (see 
Additional file 1, Figure S15a; see Additional file 5, Supple- 
mentary movie 1). When the single PrE cells were 
excluded, the single ES cells from each cell-cycle phase 
formed three different clusters, although a few single cells 
from the Gl and S phases were close to the G2/M cluster 
(see Additional file 1, Figure S15b; see Additional file 6, 
Supplementary movie 2). As expected, the differences 
between the cell-cycle phases were smaller than the differ- 
ence between the stem cells and the more differentiated 
cells. Despite these smaller differences, the single-cell 
Quartz- Seq method was able to detect the different cell- 
cycle phases within a single cell type. 

Single-cell Quartz-Seq reveals that the gene expression 
fluctuations in a single cell type in the same cell-cycle 
phase 

If the differences associated with the different cell types 
and cell-cycle phases are excluded, there still remains a 
small amount of heterogeneity due to the fluctuations in 
gene expression among single cells. 

To test whether single-cell Quartz-Seq can detect these 
gene-expression fluctuations, we calculated and plotted 
the standard deviations from two independently ampli- 
fied sets of single-cell Quartz-Seq data {n = 12 and n = 8) 
of ES cells in Gl phase (Figure 5a). The PCC of these 
standard deviations from two independent sets was 
approximately 0.85 (Figure 5a). We performed an F-test 
of the equality of the two Quartz-Seq data variances to 
identify reproducible gene-expression fluctuations (false- 
discovery rate (FDR) > 0.6). Variances in 17,064 gene 
expressions were reproducibly observed. These results 
suggest that the gene-expression fluctuations detected by 
single-cell Quartz-Seq are highly reproducible and thus 
not due to experimental errors. 

To further validate the observed gene-expression fluc- 
tuations using an independent experimental method, we 
used amplification-free single-cell qPCR to assess the 
gene expression of nine genes (Figure 5b), which were 
selected according to their gene-expression levels. To 
assess both the gene-expression fluctuations and the 
experimental errors, we prepared samples from single 
cells (for the assessment of the gene-expression fluctua- 
tions) and single-cell-sized samples of pooled cells (for 
the assessment of the experimental errors). We expected 
that, if the gene-expression fluctuations detected by sin- 
gle-ceU Quartz-Seq were not solely due to experimental 
errors, the gene-expression fluctuations detected by the 
single-cell qPCR of the single-cell samples would be 
greater than the experimental errors detected from the 
pooled samples. As expected, we found that the gene- 
expression fluctuations detected by single-cell qPCR for 



the single-cell samples were indeed significantly greater 
than the experimental errors detected from the pooled 
samples (F-test, P<0.001, Figure 5b). This result indi- 
cates that single-cell qPCR can clearly distinguish gene- 
expression fluctuations from experimental errors. 

If the single-cell Quartz-Seq method quantitatively 
detects gene-expression fluctuations, we also expected that 
the gene-expression fluctuations would be highly corre- 
lated between the single-cell Quartz-Seq and the qPCR 
methods. To confirm this, we compared the single-cell 
Quartz-Seq data with the single-cell qPCR data. To com- 
pare these two different platforms (single-cell Quartz-Seq 
and qPCR), we chose to use a relative measure, such as 
coefficient of variation (CV), rather than an absolute mea- 
sure, such as standard deviation. As expected, we found 
that the CVs of the single-cell Quartz-Seq approach were 
highly correlated with those obtained with single-cell 
qPCR (PCC for the CVs was 0.992) (Figure 5c), suggesting 
that the single-cell Quartz-Seq method quantitatively 
detects gene-expression fluctuations. 

To assess the functional features of fluctuated genes, we 
performed over-representation analysis using the Gene 
Ontology and the REACTOME pathway database for sin- 
gle-cell Quartz-Seq with 20 mouse ES cells in Gl phase 
(see Additional file 1, Figure S16). First, we performed 
clustering using PCA to collect groups of similar fluctu- 
ated genes. Each principal component was calculated 
using a hypergeometric test with the Gene Ontology and 
REACTOME pathway database. We found that the chro- 
mosome maintenance, Gl/S-specific transcription, and 
RNA polymerase II transcription pathways were signifi- 
cantly over-represented in PC 1. 

Discussion 

In this study, we established a novel WTA method that is 
optimized for single-cell RNA-seq, and detects gene- 
expression heterogeneity between individual cells. This 
WTA method for single-cell Quartz-Seq is substantially 
easier to perform than other previously developed meth- 
ods that are based on the poly-A-tailing reaction [18,19] 
(see Additional file 1, Figure SI). For example, the Kuri- 
moto et aL method requires approximately 17 PCR tubes 
and 11 reaction steps for a single cell [18], whereas the 
single-cell Quartz-Seq amplification, requires only 1 PCR 
tube and 6 reaction steps per single cell; all of the steps 
are completed in a single PCR tube without any purifica- 
tion. These improvements, which drastically simplify the 
single-cell Quartz-Seq method, will be useful for high- 
throughput production of single-cell preparations. 

In addition to its simplicity, the single-cell Quartz-Seq 
method is highly quantitative (Figure 2). We validated the 
performance of single-cell Quartz-Seq using 10 pg samples 
of purified total RNA prepared from pooled cell popula- 
tions, and found that the quantitative performance of 
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Figure 5 Single-cell Quartz-Seq reveals fluctuations in global gene expression in a single cell type in the same cell-cycle phase (a) The x 

and Y axes represent the standard deviation (SD) of gene expression from the different datasets from single-cell Quartz-Seq with single embryonic 
stem (ES) cells that were all in the Gl phase of the cell cycle (n = 12, n = 8). (b) We detected expression of nine genes {Fnl, Zfp42, Sgkl, Tfrc, Utfl, 
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single-cell Quartz-Seq was better than that of previously 
developed single-cell methods (Figure 3). Moreover, 
Quartz-Seq was useful for the analysis of cell subpopula- 
tions (50 cells containing 300 to 350 pg of total RNA) with 
highly quantitative performance {R = 0.99; see Additional 
file 1, Figure S17). 

Any method based on PGR amplification will have diffi- 
culty amplifying transcripts with an extremely high GC 
content, and thus we would expect these to be under-repre- 
sented in the Quartz-Seq. We performed a detailed com- 
parison of technical noise between different polymerases. 



and found that Quartz-Seq is more robust against high GC 
content when MightyAmp DNA polymerase is used for 
amplification of GG-rich sequences compared with Ex Taq 
DNA polymerase. 

In the analysis of cDNA lengths in each method, we 
found that the unamplified isoforms from Quartz-Seq had 
a shorter cDNA length (mean 1,684.0 bp) than the ampli- 
fied isoforms (mean 2558.6 bp). This seems counterintui- 
tive but can be explained by the principle of massively 
parallel sequencing with WTA, in which a longer cDNA 
generates more reads and therefore can be detected more 
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sensitively than a shorter cDNA (see Additional file 1, 
Figure Slid). The unamplified isoforms from Quartz-Seq 
also had a shorter cDNA length (mean 1,684.0 bp) com- 
pared with the mean length of all of the Ensembl Mouse 
Transcripts (mean 1,817 bp). The unamplified isoforms 
from Quartz- Seq had a significantly shorter cDNA length 
(mean 1,684.0 bp) than those from Smart-Seq (mean 
2,382.0 bp), suggesting that Quartz-Seq is more robust 
against a shorter cDNA length. We also found that Smart- 
Seq was unable to amplify 3,924 ± 124.5 isoforms, whereas 
the number of isoforms that could not be amplified by 
Quartz-Seq was only 1,614 ± 88.9. 

As a result of its higher reproducibility and sensitivity, 
single-cell Quartz-Seq can distinguish not only different 
cell types but also different cell-cycle phases of the same 
cell type. In addition, this method can also comprehen- 
sively detect gene-expression fluctuations within the 
same cell type and cell-cycle phase; these fluctuations 
were highly reproducible in two independent experi- 
ments (Figure 5a,c) and were distinguished from experi- 
mental errors measured from technical replicates of 
pooled samples (Figure 5b). Therefore, our method is 
capable of comprehensively and quantitatively revealing 
gene-expression fluctuations. Such fluctuations can be 
generated by both the intrinsic stochastic nature of gene 
expression and the extrinsic environmental differences 
between cells [5-8]. In fact, it has been reported that 
individual cells in a population of ES cells exhibit fluc- 
tuations in both mRNA and protein expression under 
the same culture conditions (for example, as reported 
for Nanogy Zfp42 Whsc2, Rhox9, and Zscan4) that might 
be associated with different cellular phenotypes 
[1,23-25]. Hence, the single-cell Quartz-Seq approach 
should be useful for the analysis of the roles and 
mechanisms of non-genetic cellular heterogeneity. 

Conclusions 

Single-cell Quartz-Seq is a simplified protocol compared 
with previously established methods based on the poly-A- 
tailing reaction. All of the steps are completed in a single 
PGR tube without any purification. The reproducibility 
and sensitivity of Quartz-Seq were higher than those of 
other single-cell RNA-seq methods. Use of Quartz-Seq in 
technical replicates with 10 pg each of total RNA pro- 
duced a PCC of approximately 0.93, whereas the reprodu- 
cibility of previous methods is approximately 0.7. To 
evaluate the sensitivity of Quartz-Seq, we compared the 
performance of conventional RNA-seq and single-cell 
RNA-seq methods with 10 pg total RNA and found the 
PCC to be approximately 0.88 in Quartz-Seq compared 
with approximately 0.7 in other methods. When used in 
the global expression analysis of real single cells, the sin- 
gle-cell Quartz-Seq approach successfully detected gene- 
expression heterogeneity even between cells of the same 



cell type and/or between different cell-cycle phases. This 
observed gene-expression heterogeneity was found to be 
highly reproducible in two independent experiments, and 
could be distinguished from experimental errors, which 
were measured using technical replicates of pooled sam- 
ples. Therefore, single-cell Quartz-Seq is a useful method 
for the comprehensive identification and quantitative 
assessment of cellular heterogeneity. 

Materials and methods 

Cell culture 

We used EB5 ES cells for preparation of total RNA. This 
cell line is derived from E14tg2a ES cells, in which a blasti- 
cidin-resistance gene disrupts one endogenous PouSfl 
allele. We used 5G6GR ES cells for the single-cell Quartz- 
Seq. This cell line was generated by random integration of 
the linearized Gata6-GR-IRES-Puro vector into EB5 ES 
cells [21]. These cells were cultured on gelatin-coated 
dishes, in the absence of feeder cells and in Glasgow mini- 
mal essential medium (GMEM; Sigma- Aldrich, St Louis, 
MO, USA) supplemented with 10% fetal calf serum, 1000 
U/ml leukemia inhibitory factor (ESGRO; Invitrogen 
Corp., Carlsbad, CA, USA), 100 (imol/l 2-mercaptoethanol 
(Nacalai Tesque Inc., Kyoto, Japan), Ix non-essential 
amino acids (Invitrogen), 1 mmol/1 sodium pyruvate (Invi- 
trogen), 2 mmol/1 L-glutamine (Nacalai Tesque), 0.5x 
penicillin/streptomycin (Invitrogen), and 10 (ig/ml blastici- 
din (Invitrogen). For culture of 5G6GR ES cells, 0.5 (ig/ml 
puromycin (Sigma- Aldrich) was also added to the culture. 
For differentiation of 5G6GR ES cells into PrE cells, the 
cells were seeded into medium supplemented with 100 
mmol/1 dexamethasone instead of blasticidin, and cultured 
for 72 hours. After this 72 hours culture, the 5G6GR cells 
had completely differentiated into PrE cells. 

RNA preparation 

The total RNA was purified from the ES cells (EB5 cell 
line) using reagent (TRIzol; Life Technologies Corp., 
Carlsbad, CA, USA) and a commercial kit (RNeasy Mini 
Kit; Qiagen Inc., Valencia, CA, USA). The amount of total 
RNA from an ES cell was quantified using an absorpti- 
ometer (ND-1000; LMS, Tokyo, Japan). The length distri- 
bution of the total RNA was measured using a RNA 6000 
Nano Kit (Agilent Biotechnology, Santa Clara, CA, USA), 
which produced an RNA integrity number of 10 for the 
total RNA. The spike RNAs were synthesized using the 
pGIBS-LYS, pGIBS-DAP, pGIBS-PHE, and pGIBS-THR 
plasmids (American Type Culture Collection (ATCC), 
Manassas, VA, USA) and the MEGAscript T3 kit (Ambion 
Inc., Austin, TX, USA), as previously reported [12]. The 
spike RNAs were added to the total RNA from the ES 
cells as follows (per 10 pg of total RNA): Lys, 1000 copies; 
Dap, 100 copies; Phe, 20 copies; and Thr, 5 copies. The 
total RNA containing the spike RNAs was diluted to 
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25 pg/(il using single-cell lysis buffer (0.5% NP40, Thermo) 
immediately before amplification. 

Single-cell collection using FACS 

The cultured cells were dissociated using trypsin-EDTA 
at 37°C for 3 minutes. The resulting cells were subse- 
quently washed with PBS buffer. A total of 0.5 x 10^ cells 
were stained with 1 ml of PBS containing 10 (ig/ml 
Hoechst 33342 at 37°C for 15 minutes. The stained cells 
were sorted as previously reported, based on the Hoechst 
33342-stained cell area of the FACS distribution [22]. To 
increase the amplification success rate, we added several 
bubbles to the single-cell lysis buffer using a micropipette 
(see Additional file 1, Figure S13). We sorted each single 
cell into a 0.4 [d aliquot of lysis buffer with a bubble; the 
buffer was pre-chilled to 0°C using a PGR chill rack (Iso- 
Freeze; Labgene Scientific, Chatel-St-Denis, Switzerland). 
Subsequently, we performed WTA with each single-cell 
lysis sample. For details of all of the samples used, see 
Additional file 7, Table S2. 

Whole-transcript amplification for single-cell Quartz-Seq 

To reduce the risk of RNase contamination, the work- 
bench environment and all experimental equipment were 
cleaned using an RNase removal reagent (RNase Out; 
Molecular BioProducts, San Diego, CA, USA). We used 
low-retention single PGR tubes or sets of 8-linked PGR 
tubes for single-cell amplifications (TaKaRa Bio Inc., Otsu, 
Japan). The cells and 10 pg total RNA samples were dis- 
solved in 0.4 [i\ of single-cell lysis buffer (0.5% NP40) in an 
aluminum PGR rack at 0°G and transferred to ice. These 
solutions were mixed using a bench-top mixer (MixMate; 
Eppendorf, Westbury, NY, USA) at 2,500 rpm and 4°G for 
15 seconds and then at 3000 x g and 4°G for 10 seconds. 
Immediately after the second centrifugation, 0.8 \i\ of 
priming buffer (1.5x PGR buffer with MgGl2 (TaKaRa 
Bio), 41.67 pmol/1 of the RT primer (HPLG-purified; 
Table 1), 4 U/(il of RNase inhibitor (RNasin Plus; Promega 
Gorp., Madison, WI, USA), and 50 (imol/l dNTPs were 
added to each tube. The solutions were mixed at 2,500 
rpm and 4°G for 15 seconds. The denaturation and prim- 
ing were performed at 70°G for 90 seconds and 35°G for 
15 seconds using a thermal cycler (GIOOO and SIOOO; Bio- 
Rad Laboratories, Inc., Hercules, GA, USA), and the reac- 
tion tubes were placed into an aluminum PGR rack at 0°G. 
Subsequently, 0.8 [i\ of RT buffer (Ix PGR buffer, 25 U/[i\ 
reverse transcriptase (Superscript III; Life Technologies), 
and 12.5 mmol/1 DTT) was added to each tube. The 
reverse transcription was performed at 35°G for 5 minutes 
and 45°G for 20 minutes, and the reactions were heat-inac- 
tivated at 70°G for 10 minutes. The reaction tubes were 
then placed into an aluminum PGR rack at 0°G. We con- 
sistently used the latest available lots of the reverse tran- 
scriptase (Superscript III) for the single-cell amplification. 



After centrifugation at 3000 x g and 4°G for 10 seconds, 1 
[i\ of the exonuclease solution (Ix Exonuclease buffer and 
1.5 U/(il exonuclease I; both TaKaRa Bio) was added to 
each tube. The primer digestion was performed at 37°G 
for 30 minutes, and the reactions were heat-inactivated at 
80°G for 10 minutes. The reaction tubes were placed into 
an aluminum PGR rack at 0°G. After centrifugation at 
3,000 X g and 4°G for 30 seconds, 2.5 [A of poly-A-tailing 
buffer (Ix PGR buffer, 3 mmol/1 dATP, 33.6 U/[A terminal 
transferase (Roche Applied Science, Indianapolis, IN, 
USA), and 0.048 U/[i\ RNase H (Invitrogen) was added to 
each tube in the aluminum PGR rack at 0°G. The reaction 
tubes were mixed at 2,500 rpm and 4°G for 15 seconds. 
Immediately after centrifugation at 3000 x g and 0°G for 
10 seconds, the reaction tubes were placed into a thermal 
cycler block, which was pre-chilled to 0°G. Subsequently, 
the poly-A-tailing reaction was performed at 37°G for 50 
seconds and heat-inactivated at 65°G for 10 minutes. The 
reaction tubes were then placed into an aluminum PGR 
rack at 0°G. After centrifugation at 3000 x g and 4°G for 
30 seconds, the reaction tubes were placed into an alumi- 
num PGR rack at 0°G. We then added 23 [i\ of the second- 
strand buffer (1.09x MightyAmp Buffer v2 (TaKaRa), 
70 pmol/1 tagging primer (HPLG-purified; Table 1), and 
0.054 U/(il MightyAmp DNA polymerase (TaKaRa)) to 
each tube. The reaction tubes were mixed at 2500 rpm 
and 4°G for 15 seconds. After centrifugation at 3000 x g 
and 4°G for 10 seconds, the second-strand synthesis was 
performed at 98°G for 130 seconds, 40°G for 1 minute, and 
68°G for 5 minutes. The reaction tubes were then immedi- 
ately placed into an aluminum PGR rack at 0°G, and 25 [A 
of PGR buffer (Ix MightyAmp Buffer version 2 and 1.9 
(imol/l suppression PGR primer (HPLG-purified; Table 1)) 
was added. The reaction tubes were mixed at 2,500 rpm 
and 4°G for 15 seconds. After centrifugation at 3000 x g 
and 4°G for 10 seconds, the PGR enrichment was per- 
formed using the following conditions per cycle for a total 
of 21 PGR cycles: 98°G for 10 seconds, 65°G for 15 sec- 
onds, and 68°G for 5 minutes. After the PGR step, the 
reaction tubes were incubated at 68°G for 5 minutes. The 
reaction tubes were then placed into an aluminum PGR 
rack at 25°G. The amplified cDNA was purified using a 
PGR purification column (MinElute; Qiagen) or a PGR 
purification bead system (Agencourt AMPure XP; Beck- 
man Goulter Inc., Brea, GA, USA). The obtained amplified 
cDNA was used for subsequent detection by each 
platform. 

For the Smart-Seq analysis, we amplified cDNA from 
10 pg of the total RNA from ES cells using a commer- 
cial kit (SMARTer Ultra Low RNA Kit for lUumina 
sequencing; Glontech, Mountain View, GA, USA). After 
19 cycles of PGR enrichment from the 10 pg of total 
ES-cell RNA, 2 to 3 ng of amplified cDNA were 
obtained. 
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Sequence 5'->' 

TATAGAATOGCGGCCGCTCGCGATAATACGACTCACTATAGGGCG I I I I I I I I I I I I I I I I I I I I I I I I 
TATAGAATOGCGGCCGCTCGCGAI I I I I I I I I I I I I I I I I I I I I I I I 
(NH2)-GTATAGAATOGCGGCCGCTCGCGAT 

AATGATACGGCGACCACCGAGATCTACACTCmCCCTACACGACGCTCTOCGATC^T 
(5'-phosphate)GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATGTATCTCGTATGCCGTCTOTGC^*G 
(5'-phosphate)GATCGGAAGAGCACACGTCTGAACTCCAGTCACTGACCAATCTCGTATGCCGTC^CTGC^*G 
(5'-phosphate)GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGCCGTCTOTGC^*G 
(5'-phosphate)GATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAATATCTCGTATGCCGTC^CTGC^*G 
(5'-phosphate)GATCGGAAGAGCACACGTCTGAACTCCAGTCACCAGATCATCTCGTATGCCGTC^CTGOT*G 
(5'-phosphate)GATCGGAAGAGCACACGTCTGAACTCCAGTCACCTOTAATCTCGTATGCCGTC^CTGC^*G 
AATGATACGGCGACCACCGA^G 
CAAGCAGAAGACGGCATACGA^G 

TATAGAATCGCGGCCGCTCGCGATAATACGACTCACTATAGGGCG I I I I I I I I I I I I I I I I I I I I I I I I 



Table 1 Primers used in the various reactions 

Primer 

RT primer {WTA) 
Tagging primer 
Suppression primer 
TRSU 
TRSI-2 
TRSI-4 
TRSI-5 
TRSI-6 
TRSI-7 
TRSI-12 
TPC1 
TPC2 

RT primer (qPCR) 

qPCR, quantiative PCR, WTA, whole-transcript amplification. 
^Indicates phosphorothioate bonds 

Library preparation for single-cell Quartz-Seq 

We prepared a library for conventional RNA-seq (with 
non-WTA) using a commercial kit (TruSeq RNA Sample 
Kit; Illumina Inc., San Diego, CA, USA) in accordance 
with the manufacturer's protocol with the exception of 
the PCR enrichment, for which we used a different poly- 
merase (HiFi DNA polymerase; Kapa Biosystems Inc., 
Woburn, MA, USA). 

To prepare the library for Quartz-Seq (with WTA) and 
Smart-Seq (with WTA), we prepared a DNA sequencing 
library using our optimized library preparation method, 
which we call ligation-based Illumina multiplex library 
preparation (LIMprep). For LIMprep, we used the same 
library preparation kit as before (Kapa Biosystems) and 
self-produced the TruSeq (Illumina) adaptors and PCR 
primers. For Quartz-Seq, 20 ng of amplified cDNA was 
diluted in 130 \i\ of Tris-EDTA (TE) buffer. The solutions 
were transferred into snap-cap microtubes (Covaris Inc., 
Woburn, MA, USA). The amplified cDNAs in the micro- 
tubes were fragmented using a focused ultrasonicator 
(model S220; Covaris). The ultrasonication process was 
configured as follows: duty factor 10%; peak incident 
power 175 W; cycles per burst 100; and time 600 seconds. 
The fragmented cDNA was purified into 10 [A of nucle- 
ase-free water using a concentrating column (DNA Clean 
and Concentrator-5; Zymo Research, Irvine, CA. USA). 
Subsequently, 40 [A of the reaction mix (1.25x End Repair 
Buffer and 1.25 x End Repair Enzyme Mix) (Kapa Biosys- 
tems) was added to 10 [i\ of the fragmented cDNA solu- 
tion. The end-repair reaction was performed at 20°C for 
30 minutes. The end-repaired DNA was purified into 
12.5 [A of EBl/10 buffer (1 mmol/1 Tris-HCl pH 8.0) using 
a concentrating column as before (DNA Clean and 
Concentrator-5; Zymo Research). Subsequently, 12.5 [i\ of 
the A-tailing mix (2x A-tailing Buffer and 2x A-tailing 
Enzyme) was added to 12.5 [d of the end-repaired DNA 



solution. The A-tailing reaction was performed at 30°C for 
30 minutes. The A-tailed DNA was purified into 12.5 \i\ of 
EBl/10 buffer using a concentrating column as before 
(DNA Clean and Concentrator-5; Zymo Research). Subse- 
quently, 12.5 [i\ of the adaptor ligation mix (2x ligation 
buffer, 2x DNA ligase, and 10 pmol of each self-produced 
adaptor) was added to 12.5 [i\ of the A-tailed DNA solu- 
tion at 4°C. The adaptor ligation was performed at 20°C 
for 15 minutes. For the adaptor ligation, we used 10 pmol 
of the self-produced TruSeq adaptor per sample. Each 
self-produced TruSeq adaptor was prepared using the fol- 
lowing HPLC-purified primers (Hokkaido System Science 
Co. Ltd., Sapporo, Japan): TRSU, TRSI-2, TRSI-4, TRSI-5, 
TRSI-6, TRSI-7, and TRSI-12 (Table 1). 

Each primer was dissolved in the adaptor buffer (10 
mmol/1 Tris-HCl pH 7.8, 0.1 mmol/1 EDTA pH 8.0, and 
50 mmol/1 NaCl) to a concentration of 100 (imol/l. Equal 
amounts of 100 (imol/l TRSU and 100 (imol/l of each 
TRSI primer were added to the PCR tubes. After mixing, 
these primers were incubated at 95°C for 2 minutes. The 
primer annealing was then performed (95°C for 2 min- 
utes, followed by a temperature decrease of -0.5°C per 
cycle for 170 cycles). Subsequently, the reaction tubes 
were incubated at 4°C for 5 minutes. The resulting adap- 
tors were diluted with adaptor buffer to a concentration 
of 10 (imol/l. We prepared 1 [i\ aliquots with 10 [i° of 
each adaptor, which were stored at -80°C until use. The 
removal of the adaptor dimer was performed as follows: 
25 (il of binding support buffer (1 mol/1 NaCl, 20 mmol/1 
MgCl2, and 20 mmol/1 Tris-HCl pH 7.8) and 60 [i\ of 
bead solution (Agencourt AMPure XP; Beckman 
Coulter) was added to 25 \i\ of the adaptor-ligated DNA 
solution. After 15 minutes of incubation at 25°C, the 
beads were separated using a magnetic stand for at 
least 10 minutes. Then washed twice with 80% ethanol 
for 1 minute. The adaptor-ligated DNA was eluted 
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with 25 \i\ of EBl/10. The purification step for the 
removal of the adaptor dimer was repeated. Finally, the 
adaptor-ligated DNA was eluted with 20 [i\ of EBl/10. 
A volume of 30 (il of the PGR solution (1.666x HiFi 
DNA polymerase ready mix (Kapa Biosystems), 17.5 
pmol TPCl primer, and 17.5 pmol TPC2 primer) was 
added to 20 [i\ of the adaptor-ligated DNA. The primer 
sequences are shown in Table 1. Each primer was dis- 
solved in the adaptor buffer (10 mmol/1 Tris-HCl pH 
7.8, 0.1 mmol/1 EDTA pH 8.0, and 50 mmol/1 NaCl) to 
a concentration of 100 (imol/l. Prior to PGR enrichment, 
the reaction tubes were incubated at 98°C for 45 sec- 
onds, and the PGR enrichment was then performed (98° 
C for 15 seconds, 60°C for 30 seconds and 72°C for 30 
seconds per cycle). Typically, 10 to 12 PGR cycles were 
used. After the PGR enrichment, the DNA sequencing 
library was purified (Agencourt Ampure XP beads; 
Beckman Goulter). The concentrations of the DNA 
sequencing library, including the TruSeq Index 
sequence, were estimated using a library quantification 
kit (Kapa Biosystems). The PGG for three technical 
replicates of the LIMprep protocol was 0.9976 ± 0.0005. 

GeneChip 

The cDNA was synthesized from 0.25 (ig of total RNA 
using random six-mers (Promega) and reverse transcrip- 
tase (Superscript II; Invitrogen) in accordance with the 
lUumina standard protocol. The cDNA synthesis, cRNA 
labeling reactions, and hybridization to the high-density 
oligonucleotide arrays for Mus musculus (Mouse Genome 
430 Array; Affymetrix Inc., Santa Glara, GA, USA) were 
performed in accordance with the instructions detailed in 
the manufacturer's instructions (Expression Analysis 
Technical Manual; Affymetrix). For single-cell Quartz- 
Ghip, 10 ng of amplified cDNA was used in the cRNA 
labeling reactions. The expression values of the Kurimoto 
et al. and the Quartz-Ghip methods were quantified using 
the RMA method. All of the data were normalized using 
the quantile normalization method to compare the expres- 
sion values between the different microarrays [26] . 

Bioinformatics analysis 

All raw sequencing reads were trimmed using Trimmo- 
matic software to remove the sequencing and WTA pri- 
mers. All of the trimmed sequence reads were mapped to 
the mouse reference genome (mm9) using TopHat (ver- 
sion 2.0.3 [27]) with the default parameters. FPKMs were 
calculated using Gufflinks (version2.0.1 [28]) with a tran- 
scriptome reference (Ensembl Mouse Transcript). The 
sample clustering of the single-cell RNA-seq data was per- 
formed using R software and the pvclust package [29] with 
1000 bootstrap resamplings and Ward distance functions. 
All of the data were visualized using R, the ggplot2 pack- 
age, and the cummeRbund software. 



To identify the significant differential expression 
between two differentiated cell states from single-cell 
RNA-seq data, we performed the Wilcoxon rank test, 
and calculated the mutual information between the gene- 
expression distributions of the ES cells and the PrE cells 
using an empirical Bayes estimator. 

Although our novel single-cell RNA-seq method is 
highly sensitive, cost-effective, and easy to perform, it is 
not completely without amplification bias. To identify 
the differentially expressed genes from the single-cell 
RNA-seq method with small sample datasets, we inferred 
mutual information between the gene-expression distri- 
butions of the two types of cells using an empirical Bayes 
estimator: the mutual information MI(X, Y) for pairs of 
cell states x and Y, where x and Y may, for example, 
represent expression levels of the two cell states. The MI 
is considered the KuUback-Leibler distance from the joint 
probability density to the product of the marginal prob- 
ability densities as follows: 

The MI is always non-negative, symmetric, and equal to 
0 only if x and Y are independent. The MI can be repre- 
sented as a summation of entropies: 

To infer the entropy from gene-expression data with a 
small sample size, we applied an empirical Bayes approach, 
namely, the so-called James-Stein estimator [30]. First, the 
gene-expression data were made discrete using the Freed- 
man and Diaconis algorithm to determine the number of 
bins and the width of the histograms. We then estimated 
the K2 cell frequencies of the K x K contingency table for 
each cell-state pair x and Y using the James- Stein estimator. 
Finally, we calculated H(X), H(Y), H(X, Y) and MI(X, Y). 

To define the reproducibility of the variation in the mea- 
sured expression levels between the single cells analyzed 
using Quartz- Seq, we used FDR to apply the F-test to two 
independent Quartz-Seq datasets with multi-testing 
adjustments. 

Accession codes 

All data are available on GEO [GSE42268]. 

qPCR validation of reverse-transcribed cDNA and 
quantification of amplified cDNA 

To validate the quantification of the reverse-transcribed 
cDNA and the amplified cDNA, we detected the gene 
expression using a SYBR Green system (Power SYBR 
Green PGR Master Mix; Applied Biosystems, Foster Gity, 
GA, USA) and qPGR primers as described previously [12]. 
For the qPGR primer sets for each gene, see Additional 
file 8, Table S3. 

The non-WTA sample was prepared by reverse tran- 
scription using 200 ng of total ES-cell (EB5 cell line) RNA 
containing the spike RNAs {Lys, Dap, Phe, and Thr), 
A sample of 200 ng total RNA was dissolved in 10 (il of 
single-cell lysis buffer (0.5% NP40) on an aluminum PGR 
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S4: Optimization of suppression PGR for Quartz-Seq. Figure S5: Optimal 
DNA polymerase for whole-transcript amplification. Figure S6: Quality 
check of the library preparation for single-cell Quartz-Seq. Figure S8: 
Percentage of sequence reads of the suppression PGR primer or rRNA. 
Figure S9: Relationship between the read number and the 
reproducibility. Figure S10: Optimization of cDNA length in technical 
development for single-cell Quartz-Seq. Figure S11: Trend of 
unamplified isoforms in each single-cell RNA-seq method. Figure SI 2: 
Amplified cDNA lengths resulting from single-cell RNA-seq methods. 
Figure SI 3: Success rate of whole-transcript amplification from single 
cells sorted by fluorescence-activated cell sorting (FAGS). Figure SI 4: 
Amount of total RNA from a single cell at each cell-cycle phase. Figure 
SI 5: Principal component analysis (PGA) of single cells from different cell 
types at different cell-cycle phases. Figure SI 6: Over-representation 
analyses for principal component (PG) of single cells from same cell 
types in the same cell-cycle phase (Gl). Figure SI 7: Scatter plots of 
conventional RNA-seq and Quartz-Seq using 50 ES cells in the Gl phase 
of the cell cycle and Quartz-Seq using 10 pg of total ES RNA. Figure 
SI 8: Effect of carried-over buffer for PGR efficiency. 

Additional file 2: Supplementary note. 

Additional file 3: Figure S7: All scatter plots 

Additional file 4: Table SI. All results of linear regression and 
correlation analyses. 

Additional file 5: Supplementary movie 1. Principal component 
analysis (PGA) with single-cell Quartz-Seq data of embryonic stem (ES) 
and primitive endoderm (PrE) single-cell preparations. 

Additional file 6: Supplementary movie 2. Principal component 
analysis (PGA) with single-cell Quartz-Seq data of embryonic stem (ES) 
cells in different cell-cycle phases. 

Additional file 7: Table S2. Sequencing information. 

Additional file 8: Table S3. Primer information. 



rack at 0°C and placed on ice. Immediately after, 20 [i\ of 
priming buffer (1.5x PGR buffer with MgCl2 (TaKaRa), 
41.67 pmol/1 of RT primer (HPLC-purified; Table 1), 4 U/ 
[A RNase (RNasin Plus; Promega), and 50 (imol/l dNTPs) 
was added to each tube. The solution was mixed at 2,500 
rpm and 4°C for 15 seconds. The denaturation and prim- 
ing were performed at 70°C for 90 seconds and 35°C for 
15 seconds using a thermal cycler (ClOOO and SIOOO; Bio- 
Rad Laboratories), and the reaction tubes were placed in 
an aluminum PGR rack at 0°C, then 20 \i\ of RT buffer (Ix 
PGR buffer, 25 U/(il Superscript III (Life Technologies), 
and 12.5 mmol/1 DTT) was added to each tube. The RT 
was performed at 35°C for 5 minutes and 45°C for 18 min- 
utes and then heat-inactivated at 70°C for 10 minutes. 
Subsequently, the reaction tubes were placed in an alumi- 
num PGR rack at 0°C. 

Amplification-free single-cell qPCR 

We used a semi-skirted 96-well PGR plate for the amplifi- 
cation-free single-cell qPCR. Single cells in the Gl cell- 
cycle phase were individually collected into 1 \i\ of lysis 
buffer (0.25% NP40 and 1 U/(il RNasin (Promega) plus 
RNase inhibitor) at 4°G in a PGR chill rack (IsoFreeze; 
Labgene Scientifid). After centrifugation at 2000 x g and 
4°G for 2 minutes, these solutions were mixed at 2,000 
rpm for 15 seconds. To detect the variability of the reverse 
transcription of each gene, we prepared an 'averaged' sin- 
gle-cell pooled sample. For the pooled sample, 200 single 
cells in the Gl phase were collected into 200 [i\ of lysis 
buffer. After mixing, 1 [i\ of the pooled lysis solution was 
dispensed into each well of a 96-well PGR plate. Subse- 
quently, 2 [i\ of RT buffer (1.5x VILO Reaction Buffer 
(contains random primers) and 1.5 x Superscript Enzyme 
Mix; both Invitrogen) was added to each well. These plates 
were incubated as follows: 25°G for 10 minutes, 42°G for 
60 minutes, and 85°G for 10 minutes. Then 15 \i\ of the 
qPGR dilution solution (0.0001% NP40) was added and 
mixed well. Mouse genomic DNA was used to normalize 
the qPGR quantification. To a 384-well qPGR plate, we 
added 7 (il of the qPGR solution (1.4x QuantiTect SYBR 
Green PGR Master Mix (Qiagen), 5 pmol forward primer, 
and 5 pmol reverse primer) and 3 \i\ of diluted solution 
were added The qPGR plate was incubated at 95°G for 15 
minutes. Subsequently, qPGR was performed for 45 cycles, 
which consisted of 95°G for 15 seconds, and 60°G for 1 
minute. The data were collected at 60°G. For the primer 
sets for each gene, see Additional file 8, Table S3. 

Additional material 



Additional file 1: Figure SI: Schematic of the whole-transcript 
amplification methods based on the poly-A-tailing reaction. Figure S2: 
Improvement parameters of whole-transcript amplification for Quartz- 
Seq. Figure S3: Key steps for robust suppression of byproducts. Figure 
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